Evaluating Co-authorship Networks in Author Name Disambiguation for Common Names
Abstract
With the increasing size of digital libraries, identifying author names correctly has become a challenge. The problem becomes more critical when different persons share the same name (the homonym problem) or when an author's name is written in several different ways (the synonym problem). This paper focuses on homonymous names in the computer science bibliography DBLP. The goal of this study is to evaluate a method that uses co-authorship networks and to analyze the effect of common names on it. For this purpose, we clustered the publications of authors with the same name and measured the effectiveness of the method against a gold standard of manually assigned DBLP records. The results show that, although the implemented method performs well for most names, it needs to be optimized for common names. Community detection was therefore employed to optimize the method. The results show that this improves performance for common names.
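The abstract does not spell out the clustering procedure, so the following is only a minimal sketch of one common way to use co-authorship for homonym disambiguation, not the authors' implementation. It assumes that two publications carrying the same ambiguous name belong to the same person when they are linked by a chain of shared co-authors (connected components, computed with union-find). The function name `cluster_by_coauthors`, the input layout, and all author names are hypothetical.

```python
from collections import defaultdict


def cluster_by_coauthors(publications, ambiguous_name):
    """Group publications of one ambiguous name by shared co-authors.

    publications: dict mapping a publication id to its list of author names.
    Returns a sorted list of clusters, each a sorted list of publication ids.
    Assumption: a shared co-author (transitively) implies the same person.
    """
    # Union-find over publication ids; every paper starts in its own set.
    parent = {pub_id: pub_id for pub_id in publications}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Remember the first paper in which each co-author appears and merge
    # every later paper containing that co-author into the same set.
    first_seen = {}
    for pub_id, authors in publications.items():
        for coauthor in authors:
            if coauthor == ambiguous_name:
                continue  # the ambiguous name itself carries no evidence
            if coauthor in first_seen:
                union(first_seen[coauthor], pub_id)
            else:
                first_seen[coauthor] = pub_id

    clusters = defaultdict(set)
    for pub_id in publications:
        clusters[find(pub_id)].add(pub_id)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    # Hypothetical records for illustration only.
    records = {
        "p1": ["Wei Wang", "A. Smith"],
        "p2": ["Wei Wang", "A. Smith", "B. Jones"],
        "p3": ["Wei Wang", "B. Jones"],
        "p4": ["Wei Wang", "C. Lee"],
        "p5": ["Wei Wang"],
    }
    for cluster in cluster_by_coauthors(records, "Wei Wang"):
        print(cluster)
```

In this sketch, single-author papers stay in singleton clusters, and one co-author shared between two different persons with the same name merges their clusters. That second effect is plausibly stronger for common names; a community detection step, as the abstract mentions, could be applied to split such over-merged components, but it is not shown here.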
Similar resources
A tool for generating synthetic authorship records for evaluating author name disambiguation methods
http://dx.doi.org/10.1016/j.ins.2012.04.022 The author name disambiguation task has to deal with uncertainties related to the possib...
On co-authorship for author disambiguation
Author name disambiguation deals with clustering the same-name authors into different individuals. To attack the problem, many studies have employed a variety of disambiguation features such as coauthors, titles of papers/publications, topics of articles, emails/affiliations, etc. Among these, co-authorship is the most easily accessible and influential, since inter-person acquaintances represen...
Improving the Accuracy of Author Name Disambiguation Using Ensemble Clustering
Today, digital libraries are important academic resources that include millions of citations and essential bibliographic information such as titles, author names, and publication venues. From the perspective of knowledge accumulation management, the ability to search for the desired content quickly and accurately is of great importance. The complexity and similarity in these resources cause many challenges and...
Accuracy of simple, initials-based methods for author name disambiguation
There are a number of solutions that perform unsupervised name disambiguation based on the similarity of bibliographic records or common co-authorship patterns. Whether the use of these advanced methods, which are often difficult to implement, is warranted depends on whether the accuracy of the most basic disambiguation methods, which only use the author's last name and initials, is sufficient ...
Evaluating the Use of Social Networks in Author Name Disambiguation in Digital Libraries
Digital libraries have become an important source of information for scientific communities. However, by gathering data from different sources, the problem of duplicate and ambiguous information about author names arises. Traditional methods of name disambiguation use syntactic attribute information. However, recently the use of relationship networks has been studied in data deduplication. This...